Hi, I’m working out how to use the Events V2 API to address the following use case:
- clustered service running in AWS
- instances can trigger alerts which can be specific to a sub-function and tenant. I would use these to construct a dedup key for the alert, e.g. PERSISTENCE-TENANT1.
- other instances are likely to trigger the same alert, and I would like them all to be aggregated under a common incident as separate alerts. I thought the dedup key would do that for me, but it appears not to.
- someone is assigned the incident, and restarts the DB.
- as each instance detects that the DB is now fine, I would like them to be able to clear the particular alert that each one raised.
- once all the alerts have been resolved, the incident should resolve.
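For concreteness, the events in the flow above could be built roughly like this. It is only a sketch: it uses the Events V2 body fields (`routing_key`, `event_action`, `dedup_key`, `payload`), while the helper names, the placeholder routing key, and the instance IDs are made up for illustration. Nothing is actually sent.

```python
import json

# Events API v2 endpoint; events would be POSTed here as JSON.
EVENTS_URL = "https://events.pagerduty.com/v2/enqueue"


def make_dedup_key(sub_function: str, tenant: str) -> str:
    """Hypothetical helper: build the key from sub-function and tenant."""
    return f"{sub_function}-{tenant}".upper()


def build_event(action: str, routing_key: str, sub_function: str,
                tenant: str, instance_id: str) -> dict:
    """Hypothetical helper: build a trigger or resolve event body."""
    event = {
        "routing_key": routing_key,
        "event_action": action,  # "trigger" or "resolve"
        "dedup_key": make_dedup_key(sub_function, tenant),
    }
    if action == "trigger":
        # The payload is only needed when triggering.
        event["payload"] = {
            "summary": f"{sub_function} failure for {tenant}",
            "source": instance_id,  # the instance that raised the alert
            "severity": "critical",
            "component": sub_function,
            "custom_details": {"tenant": tenant},
        }
    return event


if __name__ == "__main__":
    routing_key = "YOUR_INTEGRATION_KEY"  # placeholder, not a real key
    for instance_id in ("instance-a", "instance-b"):  # made-up instance IDs
        trigger = build_event("trigger", routing_key,
                              "persistence", "tenant1", instance_id)
        print(json.dumps(trigger, indent=2))
    resolve = build_event("resolve", routing_key,
                          "persistence", "tenant1", "instance-a")
    print(json.dumps(resolve, indent=2))
```

Both instances end up with the same `dedup_key`, and the resolve event built here carries only that key, so nothing in it says which instance's alert is being cleared.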
The problem is that I don’t see any way to use the API to get the alerts grouped under a common incident, based on their sub-function and tenant. If I use the dedup key constructed as above, then I get separate events in the log for a single alert, and those events cannot be resolved individually.
Any suggestions on how to achieve this use case, which after all seems pretty vanilla for a highly available service running in the cloud?